Spectral Learning of Binomial HMMs for DNA Methylation Data
Authors
Abstract
We consider learning parameters of Binomial Hidden Markov Models, which may be used to model DNA methylation data. The standard algorithm for the problem is EM, which is computationally expensive for sequences on the scale of the mammalian genome. Recently developed spectral algorithms can learn parameters of latent variable models via tensor decomposition, and are highly efficient for large data. However, these methods have only been applied to categorical HMMs, and the main challenge is how to extend them to Binomial HMMs while still retaining computational efficiency. We address this challenge by introducing a new feature-map based approach that exploits specific properties of Binomial HMMs. We provide theoretical performance guarantees for our algorithm and evaluate it on real DNA methylation data.
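As a rough illustration of the model class the abstract refers to, the sketch below samples from a Binomial HMM: each site has a hidden state, and the number of methylated reads at that site is drawn from a binomial distribution whose success probability depends on the state. It is not the paper's spectral algorithm. The function name `sample_binomial_hmm`, the two-state parameters, and the coverage values are all hypothetical and chosen only for the example.

```python
import random


def sample_binomial_hmm(pi, T, p, coverage, seed=0):
    """Sample hidden states and methylated-read counts from a Binomial HMM.

    pi       -- initial state distribution (length m)
    T        -- m x m transition matrix, T[i][j] = P(next = j | current = i)
    p        -- per-state methylation probability (length m)
    coverage -- number of reads n_t observed at each site
    """
    rng = random.Random(seed)
    states = list(range(len(pi)))
    hidden, counts = [], []
    h = rng.choices(states, weights=pi)[0]
    for n_t in coverage:
        hidden.append(h)
        # Emission: each of the n_t reads is methylated with probability p[h],
        # so the count is Binomial(n_t, p[h]).
        c_t = sum(rng.random() < p[h] for _ in range(n_t))
        counts.append(c_t)
        # Move to the next hidden state.
        h = rng.choices(states, weights=T[h])[0]
    return hidden, counts


# Hypothetical two-state example: state 0 is mostly unmethylated,
# state 1 is mostly methylated.
pi = [0.5, 0.5]
T = [[0.9, 0.1], [0.2, 0.8]]
p = [0.1, 0.9]
coverage = [10, 5, 0, 20, 8]
hidden, counts = sample_binomial_hmm(pi, T, p, coverage)
```

Unlike a categorical HMM, the observation at each site is a pair (coverage, count), and the coverage varies from site to site, which is why the emission cannot be treated as a symbol from a small fixed alphabet.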
Similar resources
Hilbert Space Embeddings of Hidden Markov Models
Hidden Markov Models (HMMs) are important tools for modeling sequence data. However, they are restricted to discrete latent states, and are largely restricted to Gaussian and discrete observations. And, learning algorithms for HMMs have predominantly relied on local search heuristics, with the exception of spectral methods such as those described below. We propose a nonparametric HMM that exten...
Predicting CpG Islands and DNA Methylation in the Cow Genome Using DNA Microarray Meta-Analysis and Genome Wide Scanning
DNA methylation is a type of epigenetic change that directly affects DNA. In mammals, DNA methylation is essential for fetal development and stem cell differentiation, and it occurs mainly within CpG islands. In this study, two methods were used to study the DNA methylation profile of the cow genome. In the first method, the DNA methylation profile of the differentially expres...
Implementing spectral methods for hidden Markov models with real-valued emissions
Hidden Markov models (HMMs) are widely used statistical models for modeling sequential data. The parameter estimation for HMMs from time series data is an important learning problem. The predominant methods for parameter estimation are based on local search heuristics, most notably the expectation–maximization (EM) algorithm. These methods are prone to local optima and oftentimes suffer from hi...
The role and importance of DNA methylation in spermatogenesis process
Background: DNA methylation is one of the epigenetic marks that are created by de novo DNA methylation and maintained through cell division. This process is catalyzed by DNA methyltransferases. Establishment of DNA methylation in the germ line is important, since these marks have the potential to regulate gene expression in offspring, and improper DNA methylation patterns in germ lines have serious conseque...
A Stochastic Model for the Formation of Spatial Methylation Patterns
DNA methylation is an epigenetic mechanism whose important role in development has been widely recognized. This epigenetic modification results in heritable changes in gene expression not encoded by the DNA sequence. The underlying mechanisms controlling DNA methylation are only partly understood and recently different mechanistic models of enzyme activities responsible for DNA methylation have...
Journal title:
- CoRR
Volume abs/1802.02498 Issue
Pages -
Publication date 2018